Spatial R Course
Ed Carnell & Phil Taylor

October 2019

1 Loading & installing packages

This script uses the following packages:

  • sf is used for processing vector data (points, polygons, lines)
  • raster is used for processing gridded data sets (rasters)
  • ggplot2 is used for producing static maps and graphs
  • leaflet is used for producing interactive maps
  • RColorBrewer provides a good range of colour schemes for maps

The following lines load these required packages using the function require(package_name).

N.B. Packages need to be installed before they are loaded, the function install.packages("package_name") is used to install new packages.

#install.packages("sf")
#install.packages("raster")
#install.packages("leaflet")
#install.packages("RColorBrewer")
#install.packages("ggmap")
#install.packages("leafpop")
require(ggplot2)
require(raster)
require(leaflet)
require(RColorBrewer)
require(sf)
require(ggmap)
require(leafpop)

2 Loading (non-spatial) data

2.1 Specifying file locations

To the load in the file sent to us by Sovaia, we need to specify the location of the file.

There are two ways to do this

  • You could set a ‘working directory’ and specify the directory in which all your files are stored. To set a working directory the setwd() function is used. Once set, R will assume all specified files are stored in your a working directory.

  • You can specify the location of each file as you load them. I prefer this method of working as I typically need to access files across multiple working directories

# Set shared project folder
prj_dir <- "C:/Link/To/Your/Data/Folder/"

2.2 Concatenating text strings

With the project folder specified, we can concatenate the file path and file name together using paste0() (This is the equivalent of the Excel function “Concatenate”). If you need to add in a separator between text strings the function paste() can be used. For example, if you wanted to add a space and comma between a list of names you’d use paste(c("Ed","Phil","Sovaia"),sep=", "), which would produce the result Ed, Phil, Sovaia

# Join file name to directory
fn <- paste0(prj_dir,"Points/Fiji_field_sites.csv")

# Print output
print(fn)
## [1] "C:/Link/To/Your/Data/Folder/Points/Fiji_field_sites.csv"

2.3 Reading in tables

The csv file sent to us by Sovaia can be loaded using read.csv(). For (very) large .csv files the fread() function in the package data.table is extremely useful (I don’t use that here as it’s less intuitive if you’re not familiar with data.table)

# Read in csv from Sovaia
csv <- read.csv(fn)

# Visually inspect the data file
csv
##    Site_ID  Site_Name Research_Activity Personnel x_coord y_coord
## 1        1         Ba              Goby         4 1888681 3934831
## 2        2       Lami             Skink        12 1963448 3880167
## 3        3   Korolevu          Hibiscus         2 1905213 3907123
## 4        4     Dakuni           Turtles         5 1937862 3847929
## 5        5      Navua             Skink        15 1938443 3869094
## 6        6       Suva                HQ        47 1966864 3874481
## 7        7     Malevu           Turtles        14 1838628 3983464
## 8        8      Tonge          Hibiscus         4 1859599 3966160
## 9        9 Koroyanitu             Skink         5 1877548 3929012
## 10      10     Voma A          Hibiscus         2 1932943 3892168
## 11      11     Voma B              Goby         4 1935578 3893486
## 12      12     Voma C             Skink         3 1935048 3889403
## 13      13     Vitawa           Turtles         8 1932105 3958886
## 14      14 Nayavutoka           Turtels         2 1963082 3939379
## 15      15   Sigatoka              Goby        11 1871212 3874765

2.4 Replacing text

You’ll notice that the spelling mistake of "Turtels" in the Research_Activity column is still there. To correct that we can use gsub(), this function as is the equivalent of “Find + Replace” in Microsoft Word & Excel

# Correct spelling error of 'Turtels'
csv$Research_Activity <- gsub(pattern = "Turtels",
                              replacement = "Turtles",
                              x = csv$Research_Activity)

2.5 Returning unique values

We can check that has worked by returning all the unique values in the column “Research_Activity”. The unique() function is used to return all unique values (numbers/strings).

# Check error is fixed
unique(csv$Research_Activity)
## [1] "Goby"     "Skink"    "Hibiscus" "Turtles"  "HQ"

With the spelling mistake corrected, we can now add spatial features to the data (using the coordinate columns)

3 Create a spatial object

3.1 data.frame to sf

To create a spatial object from data we’ve been sent, we need to convert the current data.frame into an sf object (sf stands for simple features and is a format used by R to handle spatial data).

To convert between the two object types we use the function st_as_sf(). The function st_as_sf requires:

  • x : an object to be converted into an sf object (in this example a data.frame)

  • coords : The names or numbers of the numeric columns holding coordinates (N.B. multiple coordinate can be supplied for polygon data, but this is a less common use of the function)

  • crs : coordinate reference system (in this case Fiji Map Grid - ESPG 3460)

# Convert data.frame to "simple features" object
# Coordinate Reference System = Fiji Map Grid (ESPG 3460)
sf_pnt <- st_as_sf(x = csv,
                   coords = c("x_coord","y_coord"),
                   crs = 3460)

# See what that looks like
sf_pnt
## Simple feature collection with 15 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 1838628 ymin: 3847929 xmax: 1966864 ymax: 3983464
## epsg (SRID):    3460
## proj4string:    +proj=tmerc +lat_0=-17 +lon_0=178.75 +k=0.99985 +x_0=2000000 +y_0=4000000 +ellps=WGS72 +towgs84=0,0,4.5,0,0,0.554,0.2263 +units=m +no_defs
## First 10 features:
##    Site_ID  Site_Name Research_Activity Personnel                geometry
## 1        1         Ba              Goby         4 POINT (1888681 3934831)
## 2        2       Lami             Skink        12 POINT (1963448 3880167)
## 3        3   Korolevu          Hibiscus         2 POINT (1905213 3907123)
## 4        4     Dakuni           Turtles         5 POINT (1937862 3847929)
## 5        5      Navua             Skink        15 POINT (1938443 3869094)
## 6        6       Suva                HQ        47 POINT (1966864 3874481)
## 7        7     Malevu           Turtles        14 POINT (1838628 3983464)
## 8        8      Tonge          Hibiscus         4 POINT (1859599 3966160)
## 9        9 Koroyanitu             Skink         5 POINT (1877548 3929012)
## 10      10     Voma A          Hibiscus         2 POINT (1932943 3892168)

3.2 Changing coordinate systems

In order to compare the Fijian points to other data sets, we need to change the coordinate system from Fiji Map Grid. WGS84 is commonly used for global data sets

# Project coordinates to WGS84 (longitude, latitude)
sf_pnt_wgs84<-st_transform(x = sf_pnt,
                           crs = 4326)

# Check that's worked
sf_pnt_wgs84
## Simple feature collection with 15 features and 4 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 177.2335 ymin: -18.37327 xmax: 178.437 ymax: -17.14371
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## First 10 features:
##    Site_ID  Site_Name Research_Activity Personnel
## 1        1         Ba              Goby         4
## 2        2       Lami             Skink        12
## 3        3   Korolevu          Hibiscus         2
## 4        4     Dakuni           Turtles         5
## 5        5      Navua             Skink        15
## 6        6       Suva                HQ        47
## 7        7     Malevu           Turtles        14
## 8        8      Tonge          Hibiscus         4
## 9        9 Koroyanitu             Skink         5
## 10      10     Voma A          Hibiscus         2
##                      geometry
## 1  POINT (177.7013 -17.58611)
## 2  POINT (178.4048 -18.08256)
## 3  POINT (177.8558 -17.83724)
## 4  POINT (178.1621 -18.37327)
## 5  POINT (178.1682 -18.18204)
## 6     POINT (178.437 -18.134)
## 7  POINT (177.2335 -17.14371)
## 8  POINT (177.4294 -17.30143)
## 9  POINT (177.5961 -17.63811)
## 10  POINT (178.117 -17.97339)

You can see the values in geometry have changed and they are now in long/lat

3.3 Saving spatial data

Now that we have the points in long/lat, we can save this as a Shapefile or GeoPackage. We’ll save it in the same folder as the original prj_dir

# st write (used to save sf objects)
st_write(obj = sf_pnt_wgs84,
         dsn = paste0(prj_dir,"Points/fiji_site_locations_wgs84.shp"),
         delete_dsn = T)
## Deleting source `C:/Link/To/Your/Data/Folder/Points/fiji_site_locations_wgs84.shp' using driver `ESRI Shapefile'
## Writing layer `fiji_site_locations_wgs84' to data source `C:/Link/To/Your/Data/Folder/Points/fiji_site_locations_wgs84.shp' using driver `ESRI Shapefile'
## Writing 15 features with 4 fields and geometry type Point.

We can also quickly plot the newly projected data using the base plot() function

# Plot locations
plot(x = sf_pnt[,"Research_Activity"],
     pch = 16, # pch = spoint style & 16 = filled circle
     key.pos = 1) # posistion of legend. 1 = bottom

4 Producing maps with ggplot2

The base plot() function is useful for quickly checking data, but for reports etc ggplot() tends to produce nicer outputs.

4.1 Setting up a basic ggplot

ggplot() is really flexible as virtually every aspect of any graph can be edited. To start with, let’s create a black ggplot() object and add the projected points and colour these based on research activity.

#Create blank ggplot2 object
p <- ggplot()

#Add points
p <- p+geom_sf(data=sf_pnt_wgs84,
             aes(col=Research_Activity),
             size=2,
             show.legend = "point")

#show output
p

4.2 Adding in polygon data

It’s pretty hard to see where these points are located without having anything to compare them to.

Let’s load in the administrative boundaries and add these to the plot p

We can also specify the region we want to plot (using coord_sf) and change the colour scheme of the points (using scale_colour_brewer - a list of colour ramps can be found here, I’ve used Set1 as the data we are plotting is qualitative)

sf_poly <- st_read(dsn = paste0(prj_dir,"Polygons/fji_polbnda_adm2_province.shp"),
                   quiet=T)

sf_poly_wgs84 <- st_transform(x = sf_poly,
                              crs = 4326)

#Create blank ggplot2 object
p <- ggplot()

# Add polygon data
p <- p+geom_sf(data = sf_poly_wgs84,
             col = "black",
             fill = "green",
             alpha = 0.5)

# Add point data
p <- p + geom_sf(data = sf_pnt_wgs84,
                 aes(col = Research_Activity),
                 size = 3,
                 show.legend = "point")

# Zoom into area of interest
p <- p + coord_sf(xlim = c(177, 179),
                  ylim = c(-18.5, -17))

# Add colour scale
p <- p + scale_colour_brewer(palette = "Set1",
                             name = "Activity")
p

4.3 Adjusting themes

I still find the map above pretty ugly, but fortunately you can edit the theme() in ggplot() to your own tastes

You can specify any colour you like in R, by using hex colours. The below below is "#75cbe0". If you Google hex colour picker a handy tool comes up for picking colours.

# Create a new object, so as not to overwrite the plot above (p2)
# Make the grid lines even (0.5 x 0.5 degree)
p2 <- p + scale_x_continuous(breaks = seq(177,179,0.5),
                           expand = c(0,0),
                           limits = c(177,179))

p2 <- p2 + scale_y_continuous(breaks = seq(-18.5,17,0.5),
                            expand = c(0,0),
                            limits = c(-18.5,-17))

# Remove grey background
p2 <- p2 + theme(panel.background = element_rect(colour = "black",
                                                 fill = "#d9eefa"),

# Add grey gridlines
         panel.grid.major = element_line(colour = "grey"),

# Remove grey background in legend
          legend.key = element_blank(),

# Adjust legend text size
         legend.text = element_text(size=12,colour="black"),
         legend.title = element_text(size=14,colour="black",face="bold"))
p2

Or you could just make something garish, it’s up to you

# Remove grey background
p3 <- p2 + theme(panel.background = element_rect(colour = "#40ff00",
                                                 fill = "#ff00c8",
                                                 size = 4),
# Add grey gridlines
         panel.grid.major = element_line(colour = "#ff6f00"),

# Remove grey background in legend
         legend.key = element_rect(colour = "#40ff00",
                                   fill = "#ff00e1",
                                   size = 2),
# Adjust legend text size
         legend.text = element_text(size = 17,
                                    colour = "#ff001e",
                                    family = "sans"),

         legend.title = element_text(size=14,
                                     colour = "#ff3c00",
                                     face = "italic",
                                     family="serif"),

         axis.text = element_text(size=16,
                                  colour = "#8400ff",
                                  face = "bold",
                                  family = "mono"))
p3

5 Adding in base maps with ggmap

It’s Sometimes useful to add in base/background maps to check the verify the locations of your points/polygons or to produce nice looking maps for reports. In R we can download base maps using ggmap(). Unfortunately it’s now slightly complicated to add in Google base maps using ggmap as Google have added in the requirement for an API, which requires you to register and link your account to a credit-card. We can however, use third-party maps such as stamenmap

First we define a bounding box of the area we’re interested in (using lat/lon coordinates in WGS84). Then we plot the downloaded tiles using ggmap() and then add our point data in the same way as the ggplot() example.

# fetch basemap tiles of the study area
fiji <- get_stamenmap(bbox = c(left = 177,bottom = -18.5,
                               right = 179,top = -17),
                    zoom = 9,
                    maptype = "terrain")

# create a ggmap object using the downloaded tiles
p <- ggmap(fiji)

# Add points (as before in the ggplot example)
p <- p + geom_sf(data = sf_pnt_wgs84,
                 aes(col = Research_Activity),
                 size = 3,
                 show.legend = "point",
                 inherit.aes = FALSE)

p <- p + scale_colour_brewer(palette="Set1",
                             name="Activity")
# View outputs
p

6 Plotting using leaflet

Unlike QGIS, the above two examples are static maps.

Leaflet is a useful package that allows you to explore spatial data in a similar way to QGIS

6.1 Creating a basic leaflet map

First we need to create a leaflet object and add a base map (“Provider tile”). The %>% is called a pipe and is useful for linking functions together without having to assign the result each time

# Create leaflet object
lf <- leaflet() %>%
# Add satellite imagery
            addProviderTiles(provider = "Esri.WorldImagery",
                             layerId = "base") %>%
# Zoom into fiji
            setView(lng = 177.8, lat= -17.8, zoom = 8)

lf

6.2 Choice of base map in leaflet

There’s also a load of other base maps (a full list of which can be found here) and includes this revolting number:

lf %>%
  removeTiles(layerId = "base") %>%
  addProviderTiles(provider = "Stamen.Watercolor",
                   layerId = "base")
# or this ESRI relief layer
lf %>%
  removeTiles(layerId = "base") %>%
  addProviderTiles(provider = "Esri.WorldShadedRelief",
                   layerId = "base")

6.3 Adding point data to a leaflet

To colour the points in leaflet we need to create a colour palette that has all the research activities in. We do this by using unique() again and brewer.pal. Also with leaflet maps you need to specify that you would like to add a legend to the map.

# Return all unique Research Activities
activities <- unique(sf_pnt_wgs84$Research_Activity)

# Assign a colour for each Research Activity
cols <- brewer.pal(length(activities), "Set1")

#Create Leaflet colour palette
pal <- colorFactor(cols, domain = activities)

# Add to leaflet object
lf2 <-lf %>%
# Add markers for each activity
addCircleMarkers(data = sf_pnt_wgs84,
                 color = ~pal(Research_Activity),
                 stroke = FALSE,
                 fillOpacity = 0.7,
                 radius = 6) %>%

# Add a legend to the map
          addLegend(pal = pal,
                    title = "Activity",
                    values = sf_pnt_wgs84$Research_Activity,
                    layerId = "legend",
                    position = "bottomright")

lf2

6.4 Popups

We can also add in pop-ups of selected information when you click on the points. N.B. this popup = argument can be added to the addCircleMarkers above, but I thought I’d simplify the code a little

# Add to the latest leaflet object
lf3<-lf2 %>%

# Add markers for each activity
                addCircleMarkers(data = sf_pnt_wgs84,
                                 color = ~pal(Research_Activity),
                                 stroke = FALSE,
                                 fillOpacity = 0.7,
                                 radius = 6,
                                 popup = ~Site_Name)
lf3

7 Raster data

7.1 Introduction to raster data

Another common form of spatial data is gridded data (known as Raster data).

Raster data uses a regular grid of points to represent the data. This is data like a digital photo, where each pixel (or grid cell) is assigned a value (such as elevation or temperature). Since the grid is regular, the x and y coordinates do not need to be stored for each point. This makes it much quicker to work with compared with Point/Polygon/Line data, which is collectively known as vector data. Polygon data such as the Fiji boundary data, is typically larger and takes longer to visualise as it’s features can be irregular shapes and every feature has it’s own string of coordinates attached to it.

Ordnance Survey have written a more detailed comparison between vector and raster data can be found here

7.2 Loading raster data

We’ll load in some global altitude data at 1 km x 1km grid resolution

# load data
r <- raster(x = paste0(prj_dir,"Raster/elevation_1KMmd_GMTEDmd.tif"))

# plot
plot(r)

# return raster information
r
## class      : RasterLayer
## dimensions : 16800, 43200, 725760000  (nrow, ncol, ncell)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : -180, 180, -56, 84  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
## source     : C:/Link/To/Your/Data/Folder/Raster/elevation_1KMmd_GMTEDmd.tif
## names      : elevation_1KMmd_GMTEDmd
## values     : -428.5, 8650.5  (min, max)

This file has 725,760,000 values and plots in seconds! Rasterizing data is really useful, especially when you’re working with global data or national data at a high spatial resolution. The CEH Land Cover data set for example has over 1 billion grid cells and provides land cover information for every 25 m x 25 m grid cell in the UK.

The summary information of the raster seems to make sense. The lowest grid cell has a value of 428.5 & the highest value is 8,650.5 m. Although Mt. Everest is 8,848 m high, gridded data only provides one value for each cell (here 1 km x 1 km) so often doesn’t capture extremes in data (in the same way that a spot-height would).

7.3 Cropping & aggregating raster data

Sometimes you may want to lower the spatial resolution of a raster. You may want to do this for the following reasons:

  • Regional vs Local: If you wanted to use the 25 m x 25 m CEH land cover data to produce a map of dominant land-covers in the UK, plotting the original would take a while and would be pointless as you’d be unable to distinguish between the values of > 1 billion cells. In this case it would make sense to increase the spatial resolution to 1 km x 1 km and visualise the most common (modal) value in the 1,600 individual (25 m x 25 m) cells that make up the 1 km x 1km grid.

  • Speed: Some operations do take a while to run over millions (or billions) of cells, so if you can get away with using a lower-resolution data it can speed things up

Alternatively as we’re only concerned with a tiny proportion of the global data, we can just crop the raster data to our Fijian study area. Rather than aggregating the resolution.

Let’s crop the raster data and compare the 1km elevation data to mean elevation at a grid resolution of 10 km x 10 km

#crop raster to study area
r_crp <- crop(x = r, y = extent(x = 177,
                                xmax = 179,
                                ymin = -18.5,
                                ymax = -17))

#plot 1km cropped data
plot(r_crp)

#Estimate mean elevation at 10km grid resolution
r_10km <- aggregate(x = r_crp,
                    fact = 10,
                    fun = mean)

#plot 10km cropped data
plot(r_10km)

7.4 Extracting raster values

We can also extract values of a raster using vector data.

For example, we could estimate the elevation at each of our study locations. We’ll use the original 1km raster as the aggregated one looks a bit coarse for our purposes

elv <- extract(r_crp,sf_pnt_wgs84)

# Add results to sf object

sf_pnt_wgs84$Elevation <- elv


sf_pnt_wgs84[,c("Site_Name","Research_Activity","Elevation")]
## Simple feature collection with 15 features and 3 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 177.2335 ymin: -18.37327 xmax: 178.437 ymax: -17.14371
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## First 10 features:
##     Site_Name Research_Activity Elevation                   geometry
## 1          Ba              Goby      12.0 POINT (177.7013 -17.58611)
## 2        Lami             Skink     142.5 POINT (178.4048 -18.08256)
## 3    Korolevu          Hibiscus     672.0 POINT (177.8558 -17.83724)
## 4      Dakuni           Turtles       0.0 POINT (178.1621 -18.37327)
## 5       Navua             Skink      27.5 POINT (178.1682 -18.18204)
## 6        Suva                HQ      48.5    POINT (178.437 -18.134)
## 7      Malevu           Turtles       0.0 POINT (177.2335 -17.14371)
## 8       Tonge          Hibiscus       0.0 POINT (177.4294 -17.30143)
## 9  Koroyanitu             Skink     524.5 POINT (177.5961 -17.63811)
## 10     Voma A          Hibiscus     483.0  POINT (178.117 -17.97339)

7.5 Verifying locations

The elevation height for Site_Name: Tonge (line 8) seems to be a bit odd. Hibiscus should be inland.

Let’s Google where the Tonge site is and check that it corresponds with a place called Tonge in Fiji

## Coordinates
st_coordinates(sf_pnt_wgs84[sf_pnt_wgs84$Site_Name=="Tonge",])
##          X         Y
## 1 177.4294 -17.30143

Google result

N.B. Google Maps assumes coordinates are in Lat/Long (y/x) rather than Long/Lat (x/y) that we currently have

Oh no! This research station appears to be in the sea, which seems like an odd location for Hibiscus.

Let’s replace the values with 177.740455, -17.628975 (the real location of Tonge, Fiji)

# Update coordinates
st_geometry(sf_pnt_wgs84[sf_pnt_wgs84$Site_Name=="Tonge",]) <-  st_sfc(st_point(c(177.740455, -17.628975)))

# Repeat extraction of elevation
sf_pnt_wgs84$Elevation <- extract(r_crp,sf_pnt_wgs84)


sf_pnt_wgs84[sf_pnt_wgs84$Site_Name=="Tonge",c("Site_Name","Research_Activity","Elevation")]
## Simple feature collection with 1 feature and 3 fields
## geometry type:  POINT
## dimension:      XY
## bbox:           xmin: 177.7405 ymin: -17.62898 xmax: 177.7405 ymax: -17.62898
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
##   Site_Name Research_Activity Elevation                   geometry
## 8     Tonge          Hibiscus        22 POINT (177.7405 -17.62898)
# raster palette
r_pal <- colorNumeric(c("#0C2C84", "#41B6C4", "#FFFFCC"), values(r_crp),
  na.color = "transparent")

# Create leaflet object
leaflet() %>%
# Add satellite imagery
            addProviderTiles(provider = "OpenStreetMap.Mapnik",
                             layerId = "xx") %>%
# Zoom into fiji
            setView(lng = 177.8, lat= -17.8, zoom = 8) %>%

# Add a raster to the map
          addRasterImage(x = r_crp,
                         colors = r_pal,
                         opacity = 0.5) %>%
          addLegend(pal = r_pal,
                    values = values(r_crp),
                    position = "bottomright",
                    title = "Elevation") %>%

# Add markers for each activity
addCircleMarkers(data = sf_pnt_wgs84,
                 color = ~pal(Research_Activity),
                 stroke = FALSE,
                 fillOpacity = 0.5,
                 radius = 6,
                 label = ~Site_Name,
          popup = popupTable(x = sf_pnt_wgs84%>%st_set_geometry(NULL),
                                    #Remove geometry from sf object (it looks ugly otherwise)
                                    feature.id=F,
                                    row.numbers = F)) %>%

# Add a legend to the map
          addLegend(pal = pal,
                    title = "Activity",
                    values = sf_pnt_wgs84$Research_Activity,
                    layerId = "legend",
                    position = "topleft")